branch-3.1: [Fix](cloud) calc_sync_versions should consider full compaction #55630 #55710
Merged
morrySnow merged 1 commit into apache:branch-3.1 on Sep 5, 2025
…ache#55630)

Currently, `MetaServiceImpl::get_rowset` uses `calc_sync_versions` to eliminate unnecessary version ranges when a BE syncs rowset metas. One of the optimizations is the following:

```cpp
std::vector<std::pair<int64_t, int64_t>> calc_sync_versions(int64_t req_bc_cnt, int64_t bc_cnt,
                                                            int64_t req_cc_cnt, int64_t cc_cnt,
                                                            int64_t req_cp, int64_t cp,
                                                            int64_t req_start, int64_t req_end) {
    // ...
    if (req_cc_cnt < cc_cnt) {
        Version cc_version;
        if (req_cp < cp && req_cc_cnt + 1 == cc_cnt) {
            // * only one CC happened and CP changed
            // BE  [=][=][=][=][=====][=][=]
            //                 ^~~~~ req_cp
            // MS  [=][=][=][=][xxxxxxxxxxxxxx][=======][=][=]
            //                                 ^~~~~~~ ms_cp
            //                 ^____________^ related_versions: [req_cp, ms_cp - 1]
            //
            cc_version = {req_cp, cp - 1};
        } else {
            // ...
        }
```

This optimization relies on the assumption that only cumulative compaction changes the cumulative point. However, full compaction can also change the cumulative point, which breaks that assumption. The result is a data correctness problem in multi-cluster environments, because the tablet can permanently fail to sync some rowset metas.

A data correctness problem has been observed in the following situation:

1. For a certain tablet, base_compaction_cnt=14, cumulative_compaction_cnt=804, cumu_point=7458. On node A of the write cluster (cluster 0), a full compaction of [2-7464] and a cumulative compaction of [7465-7486] were performed. The stats then became base_compaction_cnt=15, cumulative_compaction_cnt=805, cumu_point=7465.
2. On node B of the read cluster (cluster 1), during sync_rowset, we have: req_base_compaction_cnt=14, base_compaction_cnt=15, req_cumulative_compaction_cnt=804, cumulative_compaction_cnt=805, req_cp=7458, cp=7465, req_start=7487, req_end=int_max.
3. `calc_sync_versions` computes that the rowsets to be pulled are [0-7464] and [7487-int_max], but it misses the rowset [7465-7486] produced by the cumulative compaction.
4. Moreover, since the max_version of the tablet on cluster 1 node B has been updated, subsequent sync_rowset operations will not pull the rowset [7465-7486] either.
5. This causes a duplicate-key problem on MOW tables, because new rowsets will generate delete bitmap marks on [7465-7486].

---

This PR disables the above optimization when the full compaction count has changed.

None

- Test
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
- Behavior changed:
    - [ ] No.
    - [ ] Yes.
- Does this need documentation?
    - [ ] No.
    - [ ] Yes.

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
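The effect of the guard described above can be sketched in isolation. The following is a minimal illustration, not the actual MetaService code: `cc_sync_range`, `covers`, and the `fc_changed` flag are hypothetical names, and the open-ended fallback range is an assumption about what a conservative non-optimized path would return.

```cpp
#include <cstdint>
#include <limits>
#include <utility>

using Version = std::pair<int64_t, int64_t>;  // inclusive [start, end]

// Hypothetical helper for illustration only; not the MetaService implementation.
// Picks the version range to re-sync once the cumulative compaction count has advanced.
// `fc_changed` is an assumed flag meaning "a full compaction happened since the request".
Version cc_sync_range(int64_t req_cc_cnt, int64_t cc_cnt, int64_t req_cp, int64_t cp,
                      bool fc_changed) {
    const bool single_cc_moved_cp = (req_cp < cp) && (req_cc_cnt + 1 == cc_cnt);
    if (single_cc_moved_cp && !fc_changed) {
        // The narrow range is only safe when cumulative compaction alone moved the point.
        return {req_cp, cp - 1};
    }
    // Assumed conservative fallback: re-sync everything from req_cp onward.
    return {req_cp, std::numeric_limits<int64_t>::max() - 1};
}

// True when `range` fully contains the rowset [start, end].
bool covers(const Version& range, int64_t start, int64_t end) {
    return range.first <= start && end <= range.second;
}
```

With the counters from the report (req_cc_cnt=804, cc_cnt=805, req_cp=7458, cp=7465), calling `cc_sync_range` with `fc_changed=false` yields [7458, 7464], which does not cover the rowset [7465-7486]. Passing `fc_changed=true` skips the narrow range, so that rowset falls inside the returned range.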
run buildall
TPC-H: Total hot run time: 32827 ms
TPC-DS: Total hot run time: 192921 ms
ClickBench: Total hot run time: 29.23 s
run p0
morrySnow approved these changes on Sep 5, 2025
calc_sync_versions should consider full compaction (#55630)
Closed
pick #55630